A Performance Evaluation of the Nehalem Quad-Core Processor for Scientific Computing

نویسندگان

  • Kevin J. Barker
  • Kei Davis
  • Adolfy Hoisie
  • Darren J. Kerbyson
  • Michael Lang
  • Scott Pakin
  • José Carlos Sancho
چکیده

In this work we present an initial performance evaluation of Intel's latest, secondgeneration quad-core processor, Nehalem, and provide a comparison to first-generation AMD and Intel quad-core processors Barcelona and Tigerton. Nehalem is the first Intel processor to implement a NUMA architecture incorporating QuickPath Interconnect for interconnecting processors within a node, and the first to incorporate an integrated memory controller. We evaluate the suitability of these processors in quad-socket compute nodes as building blocks for large-scale scientific computing clusters. Our analysis of intra-processor and intra-node scalability of microbenchmarks, and a range of large-scale scientific applications, indicates that quad-core processors can deliver an improvement in performance of up to 4x over a single core depending on the workload being processed. However, scalability can be less when considering a full node. We show that Nehalem outperforms Barcelona on memory-intensive codes by a factor of two for a Nehalem node with 8 cores and a Barcelona node containing 16 cores. Further optimizations are possible with Nehalem, including the use of Simultaneous Multithreading, which improves the performance of some applications by up to 50%.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Early experiences and results on parallelizing discrete dislocation dynamics simulations on multi-core architectures

Materials science simulations are among the leading applications for scientific supercomputing. Discrete dislocation dynamics (DDD) is a numerical tool used to model the plastic behavior of crystalline materials using the elastic theory of dislocations. DDD simulations require very long running times to produce meaningful scientific results. This work presents early experiences and results on i...

متن کامل

Parallel implementation of the wideband DOA algorithm on single core, multicore, GPU and IBM cell BE processor

The Multiple Signal Classification (MUSIC) algorithm is a powerful technique for determining the Direction of Arrival (DOA) of signals impinging on an antenna array.The algorithm is serial based, mathematically intensive, and requires substantial computing power to realize in real-time.Recently, multi-core processors are becoming more prevalent and affordable.The challenge of adapting existing ...

متن کامل

Parallelizing discrete dislocation dynamics simulations on multi-core systems

Materials science simulations are among the leading applications for scientific supercomputing. Discrete dislocation dynamics (DDD) is a numerical tool used to model the plastic behavior of crystalline materials using the elastic theory of dislocations. DDD simulations require very long running times to produce meaningful scientific results. This paper presents early experiences and results on ...

متن کامل

Parallel Performance Studies for a Three-Species Application Problem on the Cluster tara

High performance parallel computing depends on the interaction of a number of factors including the processors, the architecture of the compute nodes, their interconnect network, and the numerical code. In this note, we present performance and scalability studies on the cluster tara using a well established parallelized code for a three-species application problem. This application problem requ...

متن کامل

Performance Evaluation of Intel's Quad Core Processors for Embedded Applications

Recently, multiprocessing is implemented using either chip multiprocessing (CMP) or Simultaneous multithreading (SMT). Multi-core processors, represent CMP processors, are widely used in desktop and server applications and are now appearing in real-time embedded applications. We are investigating optimal configurations of some of the available multi-core processors suitable for developing real-...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Parallel Processing Letters

دوره 18  شماره 

صفحات  -

تاریخ انتشار 2008